The Use of Simulated Experts in Evaluating Knowledge Acquisition
Abstract
Evaluation of knowledge acquisition methods remains an important goal; however, evaluation of actual knowledge acquisition is difficult because of the unavailability of experts for adequately controlled studies. This paper proposes the use of simulated experts, i.e., other knowledge based systems, as sources of expertise in assessing knowledge acquisition tools. A simulated expert is not as creative or wise as a human expert, but it readily allows for controlled experiments. This method has been used to assess a knowledge acquisition methodology, Ripple Down Rules (RDR), at various levels of expertise, and shows that redundancy is not a major problem with RDR.

Introduction

Evaluation of knowledge acquisition (KA) methods remains an important goal. Many KA methods have been proposed and many tools have been developed. However, the critical issue for any developer of knowledge based systems (KBS) is to select the best KA technique for the task at hand. This means that papers describing methods need to provide convincing evidence of the particular advantages of a method over other methods, together with clear identification of its problems and weaknesses. Unless such evidence is provided, it is very hard to know whether or not to believe the author, who, with the best will in the world, is mainly concerned to highlight the advantages he or she believes are provided by the new method being proposed. As an example, there are still very few case studies of maintenance problems with KBS, e.g. (Bachant and McDermott 1984; Compton, Horn et al. 1989).

The problem for KA researchers is that they need to demonstrate results on actual KA from experts. Obviously it is expensive to use experts for anything other than real applications, and they are not readily available for controlled studies. The closest to a controlled scientific evaluation of KA so far seems to be Shaw's study of different experts using KSS0 (Shaw 1988). However, the aim of that study was to investigate variability in how experts provide knowledge, rather than to evaluate a KA method. The study suggested that experts organised their knowledge of the same domain quite differently from each other, and that the same expert was likely to vary his or her knowledge organisation on repeat experiments. Clearly, any study using experts needs to take into account the variability between experts as well as the difficulty of repeat experiments on the same experts, whereby they become more expert at contributing to a KBS. These are standard problems in empirical science, but they are major stumbling blocks in KA because of the cost and unavailability of experts: experts, by definition, are people whose expertise is scarce and valuable.

One solution to this problem is to pick tasks for which many people have significant expertise, so that experts are readily available. Little work appears to have been done on this approach, and in our own experience it is very difficult to identify suitable tasks. Another approach is simply to report on how methods have been used on a wide range of real world systems. This is useful but less than ideal: it only applies to established methods, not new research, and it is very difficult to quantify and compare. New methods are particularly difficult, as one must first convince an organisation of the advantages of using a hitherto untested approach. This normally happens because the developer is part of the organisation, which hardly provides a well controlled environment.
However, some standard for how application work should be reported, so that comparison is possible, is clearly desirable. The major attempts to evaluate KA to date are the Sisyphus projects (Linster 1992; Gaines and Musen 1994). In these projects a sufficient paper specification of a problem has been provided that a KBS solution could be implemented without further information being required. These studies have been very valuable because they have resulted in papers describing the development of a variety of solutions to the same problem. The basis for comparison has been informal but very interesting, even resulting in joint papers where authors contrast their methods (Fensel and Poeck 1994). However, these papers necessarily contain no information on actual KA: all the relevant knowledge was already in a paper specification. What the studies are concerned with is identifying, and perhaps building, problem solving methods suitable for the described problem, developing an appropriate domain model, and perhaps even a KA tool suited to the problem. They are not concerned with the further process of actually acquiring the knowledge to go into the knowledge base. For a system fully specified on paper in a single small document, acquisition of the knowledge is trivial once the problem solving method and domain model have been developed. However, this does not seem to be the case with real KBS projects.

This paper proposes the notion of using another KBS as a simulated expert from which knowledge can be acquired. The KA method or tool is used to acquire knowledge from the simulated expert and to build a new KBS, which should have the same competence as the KBS from which the simulated expert is derived. Instead of asking a human expert the reasons for reaching a particular conclusion, one asks the simulated expert, whose source of expertise is a previously built expert system for the domain. The obvious advantage of such an approach is that endless repeat experiments are possible and the experimenter has complete control over all the variables.

A weakness of this approach is that a data model is already given, or can be very easily derived, whereas with a human expert this may be more difficult. We mean here the data model required for communicating with the user and/or acquiring the data about a particular case to be processed; we are not concerned with further abstraction that may appear attractive and may be useful internally in the KBS. Perhaps a new data model will be developed, but there is an already implemented model, and the chances are that the new system will use an identical model. However, this does not seem a major lack, as the development of an appropriate data model is itself a major concern of conventional software engineering, and knowledge engineering seems to offer little here except that the development of the data model is integrated into the overall knowledge engineering process. The interesting question of deciding on a problem solving method still remains. The simulated expert KBS of course has a specific problem solving method, but this is not necessarily apparent, nor need it be reproduced in the new KBS. The key issue in using the simulated expert is what type of knowledge it provides. The knowledge provided by the simulated expert essentially comes from its explanation component.
The explanation component may provide a way of browsing the system, or it may provide explanations that differ from its actual reasoning in reaching a conclusion, but most likely it is going to provide some sort of activation or rule trace. This further implies a set of cases with which to exercise the simulated expert KBS; however, if such a KBS exists, the chances are that suitable cases can be made available. Because of the likely use of cases to exercise the simulated expert, this approach relates to machine learning evaluation. In machine learning evaluation, extensive use has been made of databases of cases, and the performance of different methods has been compared by their performance in learning from these databases. Some of these databases are used in the studies described below. The crucial difference between KA and ML evaluation is that ML uses the raw data of the cases to derive a KBS, whereas for KA evaluation the simulated expert's explanation of its conclusion is used to build the new KBS. ML is concerned with identifying important features from data; KA is concerned with organising knowledge about the important features provided by the expert. Clearly, different styles of KBS and different explanation facilities are going to provide quite different evaluations of the strengths and weaknesses of various KA systems. Also, an evaluation may use some or all of the knowledge provided by the simulated expert, to provide different levels of expertise. The major weakness apparent in this approach to evaluation is that the simulated expert has no meta-knowledge. It cannot report that it thinks it has told the knowledge engineer everything that is important, and it cannot reorganise its knowledge presentation to suit the desires of the knowledge engineer. However, these are also, at least partially, weaknesses of human experts, so it is probably reasonable to use a simulated expert that has no ability in this regard. KA methods that rely heavily on the meta-knowledge abilities of the expert will, however, have problems with this approach to evaluation.
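To make this concrete, a simulated expert can be wrapped behind a small interface that exposes, for each case, only a conclusion and a justification in the form of a rule trace. The following minimal Python sketch is our own illustration; the rule table and all names are assumptions, not taken from the paper.

    # A tiny flat-rule KBS wrapped as a "simulated expert": like a human
    # expert, it supplies a conclusion and a justification (here the
    # conditions of the rule that fired) for a given case.
    # A case is a dict of attribute -> value; rules are checked in order.
    RULES = [
        ({"tsh": "high", "t4": "low"}, "primary hypothyroidism"),
        ({"tsh": "high"}, "compensated hypothyroidism"),
        ({}, "no comment"),  # default rule, always fires
    ]

    class SimulatedExpert:
        def _first_firing_rule(self, case):
            for conditions, conclusion in RULES:
                if all(case.get(a) == v for a, v in conditions.items()):
                    return conditions, conclusion

        def classify(self, case):
            """The conclusion the KA tool must learn to reproduce."""
            return self._first_firing_rule(case)[1]

        def explain(self, case):
            """The rule trace: the 'important features' for this case.

            Supplying only part of this list to the KA tool under
            evaluation simulates a lower level of expertise.
            """
            return list(self._first_firing_rule(case)[0].items())

    expert = SimulatedExpert()
    case = {"tsh": "high", "t4": "low", "age": 72}
    print(expert.classify(case))  # primary hypothyroidism
    print(expert.explain(case))   # [('tsh', 'high'), ('t4', 'low')]

In the studies below, the role of this rule table is played by a KBS induced with C4.5 or Induct, and the trace is taken from its explanation facility.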
Experimental Studies

Aim

This paper describes the application of a simulated expert to evaluating the Ripple Down Rules (RDR) methodology. A frequent question raised with respect to RDR concerns the level of redundancy in the KBS and the importance of the order in which cases are presented. This question is explored with respect to three different domains, three different simulated expert KBS for each domain, three different levels of expertise and a number of different orderings of the data presented. The three domains are the Tic-Tac-Toe and Chess End Game problems from the UC Irvine data repository and the Garvan thyroid diagnosis problem, also included in the Irvine repository but here based on a larger data set. The KBS used for the simulated experts were built by induction from the same data sets using C4.5, Induct and the RDR version of Induct.

Introduction

Ripple Down Rules (RDR)

RDR is a KA methodology, and a way of structuring knowledge bases, which grew out of long term experience of maintaining an expert system (Compton, Horn et al. 1989). What became clear from this maintenance experience is that when experts are asked how they reached a particular conclusion, they do not and cannot explain how they actually reached it. Rather, they justify that the conclusion is correct, and this justification depends on the context in which it is provided (Compton and Jansen 1990). The justification will vary depending on whether the expert is trying to justify the conclusion to a fellow expert, a trainee, a layperson or a knowledge engineer. This viewpoint on knowledge has much in common with situated cognition critiques of artificial intelligence and expert systems, but here it leads to a situated approach to KA.

The RDR approach was developed with the aim of using the knowledge an expert provides only in the context within which it was provided. For rule based systems it was assumed that the context was the sequence of rules which had been evaluated to give a certain conclusion. If the expert disagreed with this conclusion and wished to change the knowledge base so that a different conclusion was reached, knowledge was added in the form of a new rule of whatever generality the expert required, but this rule was only evaluated after the same rules had been evaluated, with the same outcomes, as before. With this approach rules are never removed or corrected, only added. All rules provide a conclusion, but the final output of the system comes from the last rule that was satisfied by the data.

Initial experiments with this approach were based on rebuilding GARVAN-ES1, an early medical expert system (Horn, Compton et al. 1985; Compton, Horn et al. 1989). This system was largely rebuilt as an RDR system, and it was demonstrated that rules could be added at a rate of the order of 20 per hour, with very low error rates (Compton and Jansen 1990). It was then realised that the error rate could be eliminated by validating the rules as they were added (Compton and Preston 1990). A valid rule is one that will correctly interpret the case for which it is added and will not misinterpret any other cases which the system can already interpret correctly. One possibility is to store all the cases seen, or their exemplars (an approach used by Gaines in his version of RDR (Gaines 1991a)), and check that none of these are misinterpreted. The approach on which most RDR work has been based is to check that none of the cases that prompted the addition of other rules are misinterpreted. These cases are stored: they are the "cornerstone cases" that maintain the context of the knowledge base. In fact only one of these cases has to be checked at a time, the case associated with the rule that gave the wrong classification. To ensure a valid rule, the expert is allowed to choose any conjunction of conditions that are true for the new case, as long as at least one of these conditions differentiates the new case from the cornerstone case that could be misclassified. To ensure this, the expert is shown a difference list to choose from (Compton and Preston 1990): a list of differences between the current case and the case attached to the last true rule.

RDR have also been used for the PEIRS system described below (Edwards, Compton et al. 1993). They have also been used for a configuration task, but in that case a number of RDR knowledge bases were built inductively and a further algorithm was developed to reason across the various knowledge bases (Mulholland, Preston et al. 1993). The version of RDR used in these studies was a simple C implementation running under Unix.
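A minimal Python sketch of the two core operations just described, classification by "last rule satisfied" and correction validated against a single cornerstone case, may be helpful. This is our own illustration of the published scheme, not the paper's C implementation, and all names are assumptions.

    # Cases and rule conditions are dicts of attribute -> value.
    # The root is a default rule with no conditions, so some rule
    # always fires and every case receives a conclusion.
    class Rule:
        def __init__(self, conditions, conclusion, cornerstone):
            self.conditions = conditions    # conjunction that must hold
            self.conclusion = conclusion
            self.cornerstone = cornerstone  # case that prompted this rule
            self.except_ = None             # tried only if this rule fires
            self.else_ = None               # tried only if it does not

        def fires(self, case):
            return all(case.get(a) == v for a, v in self.conditions.items())

    def classify(rule, case):
        """Return the last rule satisfied along the evaluation path."""
        last = None
        while rule is not None:
            if rule.fires(case):
                last, rule = rule, rule.except_
            else:
                rule = rule.else_
        return last

    def difference_list(case, cornerstone):
        """Conditions of the new case that do not hold for the cornerstone."""
        return {a: v for a, v in case.items() if cornerstone.get(a) != v}

    def add_rule(root, case, conclusion, conditions):
        """Add a correction under the rule that gave the wrong conclusion.

        The chosen conditions must be true of the new case and include at
        least one entry from the difference list, so the old cornerstone
        case keeps its interpretation: rules are added, never edited.
        """
        wrong = classify(root, case)
        diff = difference_list(case, wrong.cornerstone)
        assert any(diff.get(a) == v for a, v in conditions.items())
        new = Rule(conditions, conclusion, cornerstone=dict(case))
        if wrong.except_ is None:
            wrong.except_ = new             # first exception to this rule
        else:
            node = wrong.except_            # the case failed every rule here,
            while node.else_ is not None:   # so append to the else chain
                node = node.else_
            node.else_ = new

Because a correction is always attached at the end of the path that produced the wrong conclusion, the new rule is only ever evaluated in the context in which the expert's knowledge was provided.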
Data Sets

The following data sets were used. Chess and Tic Tac Toe are from the University of California Irvine Data Repository; the Garvan data set comes from the Garvan Institute of Medical Research, Sydney.

Chess: Chess End-Game, King+Rook versus King+Pawn on a7. 36 attributes, 3196 cases, 2 classifications.

TicTacToe: Tic-Tac-Toe Endgame database, encoding the complete set of possible board configurations at the end of tic-tac-toe games. 9 attributes, 958 cases, 2 classifications.

Garvan: Thyroid function tests; a large set of data from patient tests relating to thyroid function. These cases were run through the Garvan-ES1 expert system (Horn, Compton et al. 1985) to provide consistent classifications, and the goal of any new system is to reproduce the same classification for the cases. 32 attributes, 21822 cases, 60 different classifications. These cases are part of a larger data set of 45000 cases covering 10 years; the cases chosen here are from a period when the data profiles did not appear to be changing over time and could reasonably be reordered randomly (Gaines and Compton 1994). The Garvan data in the Irvine data repository is a smaller subset of the same data. The Garvan data consists largely of real numbers representing laboratory results. Using the Garvan-ES1 preprocessor these were reduced to categories such as high and low, as used in the rules of the actual knowledge base. The preprocessed data was used in the studies below.

Machine Learning Methods

C4.5 (Quinlan 1992) is a well established machine learning program based on the ID3 algorithm. The extensions to the original ID3 are that it deals with missing data and real numbers, provides pruning, and allows the KBS to be represented as a tree or as rules, with some consequent simplification. The version of C4.5 used was provided by Ross Quinlan. It was used with the default settings, as the aim was not to produce the best possible KBS but a reasonable simulated expert. There were no real numbers in the data, but there was a lot of missing data in the Garvan data set.

Induct (Gaines 1989) is based on Cendrowska's Prism algorithm (Cendrowska 1987). Induct can produce either flat rules or RDR (Gaines 1991a), and both versions were used. The versions used were provided by Brian Gaines as part of the KSSn system. The RDR representation is generally more compact (Gaines and Compton 1992). Induct does not handle real numbers at this stage, but it deals with missing data and provides pruning. No pruning was used in this study.

Although C4.5 and Induct perform similarly, there are important differences in their underlying algorithms. C4.5 attempts to find an attribute to go at the top of the decision tree whose values best separate the various classifications, as assessed by the information calculation used. This separation is estimated as the best overall separation, so there is no requirement that any leaf should contain only one class or a particular class. The process is repeated with the cases at each leaf. In contrast, Induct selects the most common classification and attempts to find an attribute-value pair that provides the best selector for cases with this classification. Further attribute-value pairs are added to the rule to improve the selection for as long as this is statistically warranted, and the process is repeated for the remaining cases. The difference with the RDR version of Induct is that it repeats the process separately for the cases that are incorrectly selected by a rule and for those that the rule does not select, resulting in an RDR tree.
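The contrast between the two selection steps can be sketched as follows. This is a simplified illustration of our own, not the code of either program: Induct's statistical test is replaced by a simple purity stopping rule, and all names are assumptions.

    # Contrast of the two selection heuristics described above:
    # C4.5-style splitting chooses the attribute with the best overall
    # information gain; Induct/Prism-style covering grows one rule for
    # the most common class, one attribute-value pair at a time.
    import math
    from collections import Counter

    def entropy(cases):
        counts = Counter(c["class"] for c in cases)
        n = len(cases)
        return -sum(k / n * math.log2(k / n) for k in counts.values())

    def best_split_attribute(cases, attributes):
        """C4.5 step: best overall separation of all classes at a node."""
        def gain(attr):
            remainder = 0.0
            for value in {c[attr] for c in cases}:
                subset = [c for c in cases if c[attr] == value]
                remainder += len(subset) / len(cases) * entropy(subset)
            return entropy(cases) - remainder
        return max(attributes, key=gain)

    def grow_rule(cases, attributes):
        """Induct/Prism step: one rule selecting the most common class."""
        target = Counter(c["class"] for c in cases).most_common(1)[0][0]

        def precision(covered, a, v):
            selected = [c for c in covered if c[a] == v]
            return sum(c["class"] == target for c in selected) / len(selected)

        rule, covered = {}, cases
        while (any(c["class"] != target for c in covered)
               and len(rule) < len(attributes)):
            a, v = max(((a, c[a]) for c in covered for a in attributes
                        if a not in rule),
                       key=lambda av: precision(covered, *av))
            rule[a] = v
            covered = [c for c in covered if c[a] == v]
        return rule, target  # covered cases are removed and the loop repeats

The RDR version of Induct would then recurse separately on the cases the rule selects incorrectly and on the cases it does not select, rather than simply repeating on the remainder.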
Experimental Method

An RDR KBS is built by correcting errors, that is, by adding new rules for cases which have not been given the correct classification. To do this the expert selects relevant conditions from the difference list for that case. The method used here is identical, except that any expertise used in selecting important conditions from the difference list is provided by the rule trace from the simulated expert processing the same case. It should not be expected that the simulation will perform better than a real expert, or better than the machine learning techniques on which it rests; the best that can be hoped for is defined by the accuracy of the simulated expert (this essentially becomes a measure of the performance of the underlying machine learning technique). Real experts do, however, perform better than machine learning techniques when there are small data sets (Mansuri, Compton et al. 1991), and in general a little knowledge can replace a lot of cases for machine learning (Gaines 1991b). The following steps are required:
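In outline, each case is presented in turn: where the RDR system under construction disagrees with the simulated expert, the expert's rule trace is used to select conditions from the difference list and a correcting rule is added. The sketch below is our own rendering of this cycle, building on the Rule, classify, difference_list, add_rule and SimulatedExpert sketches above; the trace_fraction device for simulating levels of expertise is an illustrative assumption, not the paper's procedure.

    # Build an RDR KBS from a simulated expert by correcting errors.
    # Assumes each case differs from the relevant cornerstone case on at
    # least one attribute, so a valid rule can always be formed.
    def build_by_correction(cases, expert, trace_fraction=1.0):
        root = Rule({}, "no conclusion", cornerstone={})  # default rule
        errors = 0
        for case in cases:
            target = expert.classify(case)
            wrong = classify(root, case)
            if wrong.conclusion == target:
                continue                    # case already handled correctly
            errors += 1
            # the simulated expert's trace stands in for a human expert's
            # selection of important conditions from the difference list
            trace = expert.explain(case)
            keep = max(1, int(len(trace) * trace_fraction))
            diff = difference_list(case, wrong.cornerstone)
            chosen = {a: v for a, v in trace[:keep] if diff.get(a) == v}
            chosen = chosen or dict([next(iter(diff.items()))])
            add_rule(root, case, target, chosen)
        return root, errors

Rule counts and error rates from such runs can then be compared across case orderings, levels of expertise and simulated expert KBS.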